the 1990s-era misnomer "ISO-8859-1", see Windows-1252.
ISO/IEC 8859-1 code page layout


MIME / IANA ISO-8859-1
Alias(es) iso-ir-100, csISOLatin1, latin1, l1, IBM819, CP819
Language(s) English, various others
Standard ISO/IEC 8859
Classification Extended ASCII, ISO/IEC 8859
Extends US-ASCII
Based on DEC MCS
Succeeded by UTF-8, UTF-16
Other related encoding(s) ISO/IEC 8859-15, Windows-1252, BraSCII


ISO/IEC 8859-1:1998 


ISO/IEC 8859-1:1998, Information technology – 8-bit single-byte coded graphic character
sets – Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based
standard character encodings, first edition published in 1987. ISO/IEC 8859-1 encodes
what it refers to as "Latin alphabet no. 1", consisting of 191 characters from the Latin
script. This character-encoding scheme is used throughout the Americas, Western
Europe, Oceania, and much of Africa. It is the basis for some popular 8-bit character sets
and the first two blocks of characters in Unicode.


As of July 2024, 1.2% of all web sites use ISO/IEC 8859-1. It is the most declared
single-byte character encoding, but because Web browsers and the HTML5 standard
interpret such documents as the superset Windows-1252, they may include characters
from that set. Depending on the country or language, website use can be higher than the
global average: in Brazil it is 3.4%, and in Germany 2.7%.


ISO-8859-1 was (according to the standard, at least) the default encoding of documents 
delivered via HTTP with a MIME type beginning with text/, the default encoding of the 
values of certain descriptive HTTP headers, and defined the repertoire of characters 
allowed in HTML 3.2 documents. It is specified by many other standards. In practice, the 
superset encoding Windows-1252 is the more likely effective default, and it is
increasingly common for standards to (at least unofficially) default to UTF-8. 


ISO-8859-1 is the IANA preferred name for this standard when supplemented with the C0
and C1 control codes from ISO/IEC 6429. The following other aliases are registered:
iso-ir-100, csISOLatin1, latin1, l1, IBM819. Code page 28591, a.k.a. Windows-28591, is
used for it in Windows. IBM calls it code page 819 or CP819 (CCSID 819).
Oracle calls it WE8ISO8859P1.
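
As a rough illustration of the aliases, Python's standard codecs module resolves several of them to one underlying codec. This is only a sketch of one runtime's behaviour: the helper name same_codec is invented here, and other platforms may recognise a different subset of names.

```python
import codecs

# Names taken from the alias list above; Python ignores case and treats
# hyphens and underscores alike when looking up a codec.
ALIASES = ["ISO-8859-1", "iso-ir-100", "csISOLatin1", "latin1", "l1", "IBM819", "CP819"]


def same_codec(names):
    """Return True if every name resolves to the same codec (hypothetical helper)."""
    resolved = {codecs.lookup(name).name for name in names}
    return len(resolved) == 1


print(same_codec(ALIASES))
```

Vendor-specific names such as WE8ISO8859P1 are deliberately left out, since there is no reason to assume a general-purpose runtime knows them.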


Coverage 


See also: Latin-script alphabet 

Each character is encoded as a single eight-bit code value. These code values can be
used in almost any data interchange system to communicate in the following languages
(although the correct typographical quotation marks are missing for many of them,
including German and Icelandic):


Modern languages with complete coverage 


• Afrikaans
• Albanian
• Basque
• Breton
• Corsican
• English
• Faroese
• Galician
• Icelandic
• Ido
• Irish
• Indonesian
• Italian
• Leonese
• Lojban
• Luxembourgish[a]
• Malay[b]
• Manx
• Norwegian[c]
• Occitan
• Portuguese[d]
• Rhaeto-Romanic
• Rotokas
• Scottish Gaelic
• Scots
• Southern Sami
• Spanish
• Swahili
• Swedish
• Tagalog
• Toki Pona
• Walloon


Notes


a. Basic classical orthography
b. Rumi script
c. Bokmål and Nynorsk
d. European and Brazilian


Languages with incomplete coverage 


ISO-8859-1 was commonly used for certain languages, even though it lacks characters
used by these languages. In most cases, only a few letters are missing or they are rarely
used, and they can be replaced with characters that are in ISO-8859-1 using some form
of typographic approximation. The following table lists such languages.


Language | Missing characters | Typical workaround | Supported by
Catalan | Ŀ, ŀ (deprecated) | L·, l· |
Danish | Ǿ, ǿ (the accent is optional and ǿ is very rare) | Ø, ø or øe |
Dutch | Ĳ, ĳ (debatable); j́ in emphasized words like "blíj́f" | digraphs IJ, ij or ÿ; blíjf |
Estonian, Finnish | Š, š, Ž, ž (only present in loanwords) | Sh, sh, Zh, zh | ISO-8859-15, Windows-1252
French | Œ, œ, and the very rare Ÿ | digraphs OE, oe; Y or Ý | ISO-8859-15, Windows-1252
German | ẞ (capital ß, used only in all capitals) | digraph SS or SZ |
Hungarian | Ő, ő, Ű, ű | Ö, ö, Ü, ü or Õ, õ, Û, û (the characters replaced in 8859-2) | ISO-8859-2, Windows-1250
Irish (traditional orthography) | Ḃ, ḃ, Ċ, ċ, Ḋ, ḋ, Ḟ, ḟ, Ġ, ġ, Ṁ, ṁ, Ṗ, ṗ, Ṡ, ṡ, Ṫ, ṫ | Bh, bh, Ch, ch, Dh, dh, Fh, fh, Gh, gh, Mh, mh, Ph, ph, Sh, sh, Th, th | ISO-8859-14
Welsh | Ẁ, ẁ, Ẃ, ẃ, Ŵ, ŵ, Ẅ, ẅ, Ỳ, ỳ, Ŷ, ŷ, Ÿ | W, w, Y, y, Ý, ý | ISO-8859-14
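
The digraph workarounds can be sketched as a substitution pass applied before encoding. The mapping below covers only a few of the French, Estonian/Finnish, and German entries, and both FALLBACKS and encode_with_workarounds are hypothetical names chosen for this illustration rather than part of any standard library.

```python
# Hypothetical substitution table built from a few of the workarounds listed above.
FALLBACKS = {
    "Œ": "OE", "œ": "oe",   # French digraphs
    "Š": "Sh", "š": "sh",   # Estonian/Finnish loanword letters
    "Ž": "Zh", "ž": "zh",
    "ẞ": "SS",              # German capital sharp s
}


def encode_with_workarounds(text: str) -> bytes:
    """Replace known missing characters, then encode as ISO-8859-1.

    Characters that are neither in ISO-8859-1 nor in FALLBACKS still raise
    UnicodeEncodeError, so a gap in the table is not silently hidden.
    """
    approximated = "".join(FALLBACKS.get(ch, ch) for ch in text)
    return approximated.encode("iso-8859-1")


print(encode_with_workarounds("cœur"))
```

The substitution is lossy: decoding the result gives "coeur", not "cœur", which is why the table also lists encodings that support the characters directly.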


The letter ÿ, which appears in French only very rarely, mainly in city names such as
L'Haÿ-les-Roses and never at the beginning of words, is included only in lowercase form.
The slot corresponding to its uppercase form is occupied by the lowercase letter ß from
the German language, which did not have an uppercase form at the time when the
standard was created.
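
A small check makes the slot arithmetic concrete. It relies on the fact that accented lowercase letters in ISO-8859-1 sit 0x20 above their uppercase forms (é is 0xE9, É is 0xC9); the helper name encodable is an assumption made for this sketch.

```python
def encodable(ch: str, encoding: str = "iso-8859-1") -> bool:
    """Return True if the character exists in the given encoding (hypothetical helper)."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


lower = "ÿ".encode("iso-8859-1")[0]      # 0xFF
upper_slot = lower - 0x20                # where an uppercase form would sit: 0xDF
occupant = bytes([upper_slot]).decode("iso-8859-1")

print(hex(lower), hex(upper_slot), occupant)   # 0xff 0xdf ß
print(encodable("ÿ"), encodable("Ÿ"))          # True False
```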


Quotation marks 


For some languages listed above, the correct typographical quotation marks are missing,
as only « », " ", and ' ' are included. Also, this scheme does not provide for oriented
(6- or 9-shaped) single or double quotation marks. Some fonts will display the spacing
grave accent (0x60) and the apostrophe (0x27) as a matching pair of oriented single
quotation marks, but this is not considered part of the modern standard.
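
The following sketch checks a handful of quotation marks against the encoding. The candidate list is an arbitrary selection made for illustration, and available_in_latin1 is a hypothetical helper name.

```python
# Straight and angle quotes first, then oriented (6- or 9-shaped) marks.
CANDIDATES = ["«", "»", '"', "'", "“", "”", "‘", "’", "„", "‚"]


def available_in_latin1(chars):
    """Return the characters that ISO-8859-1 can encode, in input order."""
    found = []
    for ch in chars:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            continue
        found.append(ch)
    return found


print(available_in_latin1(CANDIDATES))
```

Only the guillemets and the two straight ASCII quotes survive, which matches the description above.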


History 


ISO 8859-1 was based on the Multinational Character Set (MCS) used by Digital
Equipment Corporation (DEC) in the popular VT220 terminal in 1983. It was developed
within the European Computer Manufacturers Association (ECMA), and published in
March 1985 as ECMA-94, by which name it is still sometimes known. The second
edition of ECMA-94 (June 1986) also included ISO 8859-2, ISO 8859-3, and
ISO 8859-4 as part of the specification.


The original draft of ISO 8859-1 placed French Œ and œ at code points 215 (0xD7) and
247 (0xF7), as in the MCS. However, the delegate from France, being neither a linguist
nor a typographer, falsely stated that these are not independent French letters on their
own, but mere ligatures (like fi or ff), supported by the delegate team from Bull Publishing
Company, who regularly did not print French with Œ/œ in their house style at the time. An
anglophone delegate from Canada insisted on retaining Œ/œ but was rebuffed by the
French delegate and the team from Bull. These code points were soon filled with × and ÷
at the suggestion of the German delegation. Support for French was further reduced
when it was again falsely stated that the letter ÿ is "not French", resulting in the absence
of the capital Ÿ. In fact, the letter ÿ is found in a number of French proper names, and the
capital letter has been used in dictionaries and encyclopedias. These characters were
added to ISO/IEC 8859-15:1999. BraSCII matches the original draft.


In 1985, Commodore adopted ECMA-94 for its new AmigaOS operating system. The
Seikosha MP-1300AI impact dot-matrix printer, used with the Amiga 1000, included this
encoding.


In 1990, the first version of Unicode used the code points of ISO-8859-1 as the first 256 
Unicode code points. 


In 1992, the IANA registered the character map ISO_8859-1:1987, more commonly
known by its preferred MIME name of ISO-8859-1 (note the extra hyphen over
ISO 8859-1), a superset of ISO 8859-1, for use on the Internet. This map assigns the C0
and C1 control codes to the unassigned code values and thus provides for 256 characters
via every possible 8-bit value.
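
Because the IANA map assigns all 256 byte values and Unicode reuses them as its first 256 code points, decoding is a direct byte-to-code-point mapping. The sketch below assumes Python's "iso-8859-1" codec implements the IANA map including the C0 and C1 controls.

```python
all_bytes = bytes(range(256))

# Every byte value decodes, and byte N becomes code point U+00NN.
text = all_bytes.decode("iso-8859-1")
code_points = [ord(ch) for ch in text]

# Encoding the result reproduces the original bytes exactly.
round_trip = text.encode("iso-8859-1")

print(code_points == list(range(256)), round_trip == all_bytes)   # True True
```

This lossless round trip is one reason the encoding is sometimes used to carry arbitrary bytes through text-only interfaces, although that use goes beyond what the standard describes.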


Code page layout


[Table omitted: ISO/IEC 8859-1 code page layout. Legend: undefined; symbols and
punctuation. Note: 0xD7 and 0xF7 were undefined in the first release of ECMA-94 (1985).
In the original draft, Œ was at 0xD7 and œ was at 0xF7.]


Similar character sets 


Main article: Western Latin character sets (computing) 


ISO/IEC 8859-15 


ISO/IEC 8859-15 was developed in 1999, as an update of ISO/IEC 8859-1. It provides
some characters for French and Finnish text and the euro sign, which are missing from
ISO/IEC 8859-1. This required the removal of some infrequently used characters from
ISO/IEC 8859-1, including fraction symbols and letter-free diacritics: ¤, ¦, ¨, ´, ¸, ¼, ½, and
¾. Ironically, three of the newly added characters (Œ, œ, and Ÿ) had already been present in
DEC's 1983 Multinational Character Set (MCS), the predecessor to ISO/IEC 8859-1
(1987). Since their original code points were now reused for other purposes, the
characters had to be reintroduced under different, less logical code points.
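
The affected positions can be listed by decoding each byte under both encodings and keeping those that disagree. This is a sketch using Python's bundled codecs; differing_bytes is a hypothetical helper name.

```python
def differing_bytes(enc_a: str, enc_b: str):
    """Return (byte, char_a, char_b) for every byte the two encodings decode differently."""
    rows = []
    for value in range(256):
        raw = bytes([value])
        a, b = raw.decode(enc_a), raw.decode(enc_b)
        if a != b:
            rows.append((value, a, b))
    return rows


changes = differing_bytes("iso-8859-1", "iso-8859-15")
for value, old, new in changes:
    print(f"0x{value:02X}: {old} -> {new}")
```

Running it reports eight positions, with the euro sign taking the place of the currency sign at 0xA4.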


ISO-IR-204, a more minor modification (called code page 61235 by FreeDOS), had
been registered in 1998, altering ISO-8859-1 by replacing the universal currency sign (¤)
with the euro sign (the same substitution made by ISO-8859-15).


Windows-1252 


The popular Windows-1252 character set adds all the missing characters provided by
ISO/IEC 8859-15, plus a number of typographic symbols, by replacing the rarely used C1
controls in the range 128 to 159 (hex 80 to 9F). It is very common to mislabel
Windows-1252 text as being in ISO-8859-1. A common result was that all the quotes and
apostrophes (produced by "smart quotes" in word-processing software) were replaced
with question marks or boxes on non-Windows operating systems, making text difficult to
read. Many Web browsers and e-mail clients will interpret ISO-8859-1 control codes as
Windows-1252 characters, and that behavior was later standardized in HTML5.
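
The mislabelling problem can be reproduced with two bytes from the C1 range. The sample bytes below are invented for illustration, and Python's "cp1252" codec is assumed to stand in for Windows-1252.

```python
# "Smart quotes" as a Windows word processor would typically store them.
raw = b"\x93quoted\x94"

as_cp1252 = raw.decode("cp1252")        # curly double quotes around the word
as_latin1 = raw.decode("iso-8859-1")    # C1 control codes U+0093 and U+0094

print(repr(as_cp1252))
print(repr(as_latin1))
```

A strict ISO-8859-1 reader ends up with invisible control codes where the quotes should be, which is what older non-Windows software rendered as question marks or boxes.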


Mac Roman 


The Apple Macintosh computer introduced a character encoding called Mac Roman in
1984. It was meant to be suitable for Western European desktop publishing. It is a
superset of ASCII, and has most of the characters that are in ISO-8859-1 and all the extra
characters from Windows-1252, but in a totally different arrangement. The few printable
characters that are in ISO/IEC 8859-1, but not in this set, are often a source of trouble
when editing text on Web sites using older Macintosh browsers, including the last version
of Internet Explorer for Mac.


Other 


DOS has code page 850, which has all printable characters that ISO-8859-1 has, albeit in 
a totally different arrangement, plus the most widely used graphic characters from code 
page 437. 
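
The repertoire claim can be checked mechanically: every printable ISO-8859-1 character should be encodable in code page 850, though usually at a different byte value. The sketch assumes Python's "cp850" codec matches the DOS code page, and missing_from is a hypothetical helper name.

```python
# Printable ISO-8859-1 characters: ASCII 0x20-0x7E plus 0xA0-0xFF.
PRINTABLE = [chr(cp) for cp in range(0x20, 0x7F)] + [chr(cp) for cp in range(0xA0, 0x100)]


def missing_from(encoding: str, chars):
    """Return the characters that the target encoding cannot represent."""
    missing = []
    for ch in chars:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


print(missing_from("cp850", PRINTABLE))                # expected to be empty
print("é".encode("iso-8859-1"), "é".encode("cp850"))   # same letter, different byte
```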


Between 1989 and 2015, Hewlett-Packard used another superset of ISO-8859-1 on
many of their calculators. This proprietary character set was sometimes referred to simply
as "ECMA-94" as well. HP also has code page 1053, which adds the medium shade
(▒, U+2592) at 0x7F.


Several EBCDIC code pages were purposely designed to have the same set of 
characters as ISO-8859-1, to allow easy conversion between them. 
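
As one illustration, the sketch below takes IBM code page 37 (available in Python as "cp037") as an example of such an EBCDIC code page. The text above does not name specific pages, so that choice is an assumption. If the repertoires match, converting between the two encodings is a pure byte permutation and loses nothing.

```python
all_bytes = bytes(range(256))

# Character repertoire of each encoding, taken over every possible byte value.
latin1_chars = set(all_bytes.decode("iso-8859-1"))
ebcdic_chars = set(all_bytes.decode("cp037"))

same_repertoire = latin1_chars == ebcdic_chars

# Converting text EBCDIC -> ISO-8859-1 -> EBCDIC should then be lossless.
ebcdic_text = "Grüße".encode("cp037")
via_latin1 = ebcdic_text.decode("cp037").encode("iso-8859-1")
back = via_latin1.decode("iso-8859-1").encode("cp037")

print(same_repertoire, back == ebcdic_text)
```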

